Efficiently Processing of Top-k Typicality Query for Structured Data

Authors

  • David C. Wyld
  • Jaehui Park
  • Sang-goo Lee

Abstract

This work presents a novel ranking scheme for structured data. We show how to apply the notion of typicality analysis from cognitive science and how to use it to formulate the problem of ranking data with categorical attributes. First, we formalize the typicality query model for relational databases. We adopt the Pearson correlation coefficient to quantify the typicality of an object. The correlation coefficient estimates the strength of the statistical relationship between two variables based on the patterns of occurrences and absences of their values. Second, we develop a top-k query processing method for efficient computation. TPFilter prunes unpromising objects based on tight upper bounds and selectively joins the tuples with the highest typicality scores. Experimental results show that our approach is promising for real data.
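The two ideas in the abstract, scoring by Pearson correlation over occurrence and absence patterns and pruning candidates by an upper bound, can be illustrated with a small Python sketch. This is not the paper's TPFilter algorithm or its bound. The function names, the toy table, and the choice of bound (the largest correlation achievable given only the marginal counts) are assumptions made for illustration.

```python
import heapq
import math


def phi_from_counts(n, sx, sy, sxy):
    """Pearson correlation of two 0/1 variables, computed from counts.

    n: number of rows; sx, sy: rows where each variable is 1;
    sxy: rows where both variables are 1.
    """
    num = n * sxy - sx * sy
    den = math.sqrt(sx * (n - sx) * sy * (n - sy))
    return num / den if den else 0.0


def top_k_typical(rows, in_target, k):
    """Rank (attribute, value) pairs by correlation with a 0/1 target column.

    Hypothetical helper: rows is a list of dicts of categorical attributes,
    in_target is a parallel list of 0/1 flags marking the target class.
    Returns the top-k (score, pair) list and the number of exact evaluations.
    """
    n = len(rows)
    sy = sum(in_target)

    # Marginal counts are cheap: one pass, no co-occurrence needed.
    counts = {}
    for row in rows:
        for item in row.items():
            counts[item] = counts.get(item, 0) + 1

    def bound(item):
        # Assumed bound: the co-occurrence count cannot exceed either marginal.
        sx = counts[item]
        return phi_from_counts(n, sx, sy, min(sx, sy))

    def exact(item):
        # The exact score needs the co-occurrence count (the costly part).
        attr, val = item
        sxy = sum(1 for row, t in zip(rows, in_target) if t and row.get(attr) == val)
        return phi_from_counts(n, counts[item], sy, sxy)

    heap = []  # min-heap holding the k best (score, item) pairs seen so far
    evaluated = 0
    for item in sorted(counts, key=bound, reverse=True):
        # Stop once no remaining candidate can beat the current k-th score.
        if len(heap) == k and bound(item) <= heap[0][0]:
            break
        score = exact(item)
        evaluated += 1
        if len(heap) < k:
            heapq.heappush(heap, (score, item))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, item))
    return sorted(heap, reverse=True), evaluated


# Toy table (made up): the first three rows belong to the target class.
rows = [
    {"flies": "yes", "legs": "2"},
    {"flies": "yes", "legs": "2"},
    {"flies": "no", "legs": "2"},
    {"flies": "no", "legs": "4"},
    {"flies": "no", "legs": "4"},
    {"flies": "yes", "legs": "6"},
]
in_target = [1, 1, 1, 0, 0, 0]
top, evaluated = top_k_typical(rows, in_target, k=1)
```

Because candidates are visited in descending order of their bound, the loop can stop as soon as the next bound is no larger than the k-th best exact score, so the remaining candidates never need their co-occurrence counts computed.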

Similar resources

Efficiently Answering Top-k Typicality Queries on Large Databases

Finding typical instances is an effective approach to understand and analyze large data sets. In this paper, we apply the idea of typicality analysis from psychology and cognition science to database query answering, and study the novel problem of answering top-k typicality queries. We model typicality in large data sets systematically. To answer questions like “Who are the top-k most typical N...

TopX: efficient and versatile top-k query processing for text, structured, and semistructured data

TopX is a top-k retrieval engine for text and XML data. Unlike Boolean engines, it stops query processing as soon as it can safely determine the k top-ranked result objects according to a monotonous score aggregation function with respect to a multidimensional query. The main contributions of the thesis unfold into four main points, confirmed by previous publications at international conference...

Overview of Top-k Query Processing in Relational Databases

Query processing is a fundamental part of a database management system. As the amount of text data stored in relational databases increases, it becomes necessary to support top-k query processing over text data. The main objective of top-k query processing is to return the k highest ranked results quickly and efficiently. In this paper, we introduce the Top-k query processing in relational dat...

Keyword Search over Graph-structured Data for Finding Effective and Non-redundant Answers

In this paper, we propose a new method for keyword search over large graph-structured data to find a set of answers which are not only relevant to the query but also reduced and duplication-free. We define an effective answer structure and a relevance measure for the candidate answers to a keyword query on graph data. We suggest an efficient indexing scheme on relevant and useful paths from nod...

Guest Editors Introduction: Special Section on Keyword Search on Structured Data

With the prevalence of Web search engines, keyword search has become the most popular way for users to retrieve information from text documents. On the other hand, there is an enormous amount of valuable information stored in structured form (relational or semistructured) in Internet, intranet, and enterprise databases. To query such data sources, users traditionally depended on specialized app...

Publication date: 2013